Learning Phonotactic Grammars from Surface Forms
Abstract
This paper presents an unsupervised batch learning algorithm for phonotactic grammars without a priori Optimality-theoretic (OT) constraints (Prince and Smolensky 1993, 2004). The underlying premise is that linguistic patterns (such as phonotactic patterns) have properties which reflect properties of the learner. In particular, this paper explores one way in which a learner benefits from considerations of locality; i.e. “the well-established generalization that linguistic [here: phonological] rules do not count beyond two” (Kenstowicz 1994:597). The formalization of locality adopted here leads to a novel, nontrivial hypothesis: all phonotactic patterns are neighborhood-distinct (to be defined in §5).
Similar resources
A Maximum Entropy Model of Phonotactics and Phonotactic Learning
The study of phonotactics (e.g., the ability of English speakers to distinguish possible words like blick from impossible words like *bnick) is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle o...
Grammars leak: Modeling how phonotactic generalizations interact within the grammar
I present evidence from Navajo and English that weaker, gradient versions of morpheme-internal phonotactic constraints, such as the ban on geminate consonants in English, hold even across prosodic word boundaries. I argue that these lexical biases are the result of a MAXIMUM ENTROPY phonotactic learning algorithm that maximizes the probability of the learning data, but that also contains a smo...
A Probabilistic Ranking Learner for Phonotactics
1.2 The Input
1.2.1 Surface forms
• Consists of surface phonetic forms to which the children may or may not have attached a meaning (so we can’t assume any sort of semantic bootstrapping, morphemic knowledge, etc.).
• We can’t assume knowledge of relationships among forms, and therefore we can’t assume knowledge of underlying forms.
• This means positive data only (what forms exist), whereas phonot...
Improving Syllabification Models with Phonotactic Knowledge
We report on a series of experiments with probabilistic context-free grammars predicting English and German syllable structure. The treebank-trained grammars are evaluated on a syllabification task. The grammar used by Müller (2002) serves as point of comparison. As she evaluates the grammar only for German, we reimplement the grammar and experiment with additional phonotactic features. Using b...